Increasing the statistical significance of entanglement detection in experiments 
Entanglement is often verified by a violation of an inequality like a Bell inequality or an
entanglement witness. Considerable effort has been devoted to the optimization of such inequalities in
order to obtain a high violation. We demonstrate theoretically and experimentally that such an 
optimization does not necessarily lead to a better entanglement test, if the statistical error is taken 
into account. Theoretically, we show for different error models that reducing the violation of an in- 
equality can improve the significance. Experimentally, we observe this phenomenon in a four-photon 
experiment, testing the Mermin and Ardehali inequality for different levels of noise. Furthermore, 
we provide a way to develop entanglement tests with high statistical significance. 

PACS numbers: 03.65.Ud, 03.67.Mn
Introduction — Quantum theory is a statistical theory,
predicting in general only probabilities for experimental
results. Consequently, in most experiments observing
quantum effects, several copies of a quantum state are
generated and individually measured to determine the
desired probabilities. As only a finite number of states
can be generated, this leads to an unavoidable statistical
error. The particularly low generation rate in certain
experiments demands a careful statistical treatment, a fact
that is well known from particle physics [1].

In quantum information processing, many of today's
experiments aim at the generation of entanglement,
which is considered to be a central resource [2]. So
far, entanglement of up to ten qubits has been achieved
using trapped ions or photons [3, 4]. For the experimental
verification of entanglement, inequalities for the
correlations, such as Bell inequalities or entanglement
witnesses, are often used [5]; a violation of such an
inequality indicates entanglement. The maximization of this
violation has been investigated in detail, cf. Refs. [?];
making such inequalities more sensitive is a crucial step in
order to allow advanced experiments with more particles.

In this paper we demonstrate theoretically and experimentally
that such an optimization does not necessarily
lead to a better entanglement test if the statistical
nature of quantum theory is taken into account. It was
already noted [?] that, when aiming at ruling out local
realism, highly entangled states do not necessarily deliver
a stronger test than weakly entangled states. This
does not, however, answer the question of which inequality
to use for a given state, and it remains unclear how to apply
that result to the actual error models used in experiments.
Also, most of the entanglement detection methods compared
in Ref. [?] cannot be applied to multiparticle systems.



Theoretically, we show for different error models that 
decreasing the violation of an inequality can improve the 
significance. Also, we demonstrate this phenomenon in a 
four-photon experiment, measuring the Mermin and the 
Ardehali inequality. We find that the former inequality 
leads to a higher significance than the latter, despite a 
lower violation. Finally, we discuss the physical origin 
of this phenomenon and provide methods to construct 
entanglement tests with a high statistical significance. 

Statement of the problem — A witness $W$ is an observable
which has a non-negative expectation value on
all separable states (i.e., states which can be written as a
mixture of product states,
$\varrho = \sum_k p_k |a_k, b_k\rangle\langle a_k, b_k|$, with
some probabilities $p_k$). Hence, a negative expectation
value of a witness signals entanglement. Similarly, a Bell
inequality $\langle B \rangle \le C_{\rm lhv}$, where $B$ is a sum of certain
correlation terms, holds if the measurement outcomes can
originate from a local hidden variable (LHV) model. As
separable states allow a description by LHV models, a
violation of a Bell inequality implies the presence of
entanglement.

In both cases, we define $V$ as the violation of the
corresponding inequality. That is, for a witness we have
$V(W) = -\langle W \rangle$, while for a Bell inequality
$V(B) = \langle B \rangle - C_{\rm lhv}$. Then, the significance of an entanglement
test can be defined as

$$ S = \frac{V}{\mathcal{E}}, \tag{1} $$

where $\mathcal{E}$ is the statistical error for the experiment.
Clearly, $\mathcal{E}$ depends on the particular experimental
implementation and on the error model used. Nevertheless,
in any experiment $S$ is a well characterized quantity; the
notion is widely used in the literature whenever a violation
is expressed in terms of "standard deviations", also
in other fields of physics.

Previously, much effort has been devoted to improving
entanglement tests in order to achieve a higher violation.
For instance, for entanglement witnesses a mature theory
of how to optimize witnesses has been developed [?]. Here,
for a given witness $W$ one tries to find a positive operator
$P$ such that $W' = W - P$ is still a witness. In order to
have a more significant result, however, one can either
increase $V$ in Eq. (1) or decrease $\mathcal{E}$. It is a central result
of this paper that decreasing $\mathcal{E}$ is often superior.

Variance as the error — Let us first consider a simple
model, in which we take the square root of the variance
as the error of a witness,

$$ \mathcal{E}(W) = \Delta(W) = \sqrt{\langle W^2 \rangle - \langle W \rangle^2}. \tag{2} $$

An experimentally relevant model will be discussed be- 
low. This simple model already demonstrates that the 
standard optimization of witnesses is often not the ap- 
propriate approach to increase the significance: 

Observation. Let $\varrho = |\psi\rangle\langle\psi|$ be a pure state detected
by the witness $W$. Then, one can always increase the
significance of $W$ at the expense of optimality (i.e., by
adding a positive operator). With this method one can
make the significance arbitrarily large.

Namely, one needs to find a positive observable $P$ such
that $|\psi\rangle$ is an eigenstate of $W' = W + P$; then the error
vanishes. Indeed, such a $P$ can be found (see Appendix).
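The Observation can be illustrated numerically. The following is a minimal sketch, not the authors' code: the two-qubit witness, the state $|\psi\rangle = \cos t\,|00\rangle + \sin t\,|11\rangle$ with $t = 0.3$, and the values of `a` and `b` are assumptions chosen only for illustration. The operator `P` follows the ansatz described in the Appendix, with the Hermitian conjugate applied to the off-diagonal term only.

```python
import numpy as np

def expval(op, psi):
    """Expectation value <psi|op|psi> for a normalized state vector."""
    return float(np.real(np.vdot(psi, op @ psi)))

def std_dev(op, psi):
    """Square root of the variance, as in Eq. (2); clipped at zero against rounding."""
    var = expval(op @ op, psi) - expval(op, psi) ** 2
    return float(np.sqrt(max(var, 0.0)))

# Assumed example: a witness built for the Bell state, applied to a
# different pure state |psi> = cos(t)|00> + sin(t)|11>.
t = 0.3
psi = np.array([np.cos(t), 0.0, 0.0, np.sin(t)])
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
W = 0.5 * np.eye(4) - np.outer(bell, bell)

mean_W = expval(W, psi)     # negative, so |psi> is detected
delta_W = std_dev(W, psi)   # nonzero, since |psi> is not an eigenstate of W
S_before = -mean_W / delta_W

# Normalized component of W|psi> orthogonal to |psi>.
perp = W @ psi - mean_W * psi
perp /= np.linalg.norm(perp)

# Ansatz: P = a|psi><psi| + b|perp><perp| + c(|psi><perp| + h.c.)
c = -delta_W
a = 0.1 * abs(mean_W)       # small enough that the violation survives
b = 2.0 * c**2 / a          # gives a*b > |c|^2, so P is positive
P = (a * np.outer(psi, psi) + b * np.outer(perp, perp)
     + c * (np.outer(psi, perp) + np.outer(perp, psi)))
W_new = W + P

violation_new = -expval(W_new, psi)   # still positive
delta_new = std_dev(W_new, psi)       # zero up to rounding
min_eig_P = float(np.linalg.eigvalsh(P).min())
```

In this sketch `W_new` has a smaller violation than `W`, yet its error in the state vanishes up to rounding, so the ratio in Eq. (1) becomes arbitrarily large.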

Multi-photon experiments — Let us now consider a re- 
alistic situation, in which other and more specific error 
models are used. As our later implementation uses multi- 
photon entanglement, we concentrate on this type of ex- 
periments but our ideas can also be applied to other im- 
plementations, such as trapped ions. 

The basic experimental quantities are the numbers $n_i$ of
detection events of the different detectors $i$. From
these data, all other quantities such as correlations or
mean values of observables are derived.

In the standard error model for photonic experiments
[?], the counts are assumed to be distributed according
to a Poissonian distribution, whose mean value is
given by the observed value. That is, for a certain
measurement outcome $i$ one sets the mean value as $\langle n_i \rangle = n_i$
and the error as $\mathcal{E}(n_i) = \sqrt{n_i}$ (being the standard deviation
of a Poissonian distribution). In general, for a function
$f = f(n_i)$ of several counts, Gaussian error propagation
is applied to obtain the error (see below).

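To give an example, consider a two-qubit correlation

$$ M = \alpha Z_1 Z_2 + \beta Z_1 \mathbb{1}_2 + \gamma \mathbb{1}_1 Z_2. \tag{3} $$

Here and in the following, $Z_k$ (or $\mathbb{1}_k$) denotes the Pauli
matrix $\sigma_z$ (or the identity matrix) acting on the $k$th qubit
and tensor product symbols are omitted. $\langle M \rangle$ can be
determined by measuring in the common eigenbasis of all
three terms in $M$, i.e., by projecting onto $|00\rangle$, $|01\rangle$, $|10\rangle$
and $|11\rangle$. Repeating this with many copies of the state
will lead to count numbers $n_{kl}$ with $k, l = 0$ or $1$ and to
count rates $p_{kl} = n_{kl}/n_{\rm tot}$, where
$n_{\rm tot} = n_{00} + n_{01} + n_{10} + n_{11}$ is the total number of events.
The mean value $\langle M \rangle$
can be written as a linear combination of the $p_{kl}$, namely
$\langle M \rangle = \lambda_{00} p_{00} + \lambda_{01} p_{01} + \lambda_{10} p_{10} + \lambda_{11} p_{11}$
with $\lambda_{00} = \alpha + \beta + \gamma$, $\lambda_{01} = -\alpha + \beta - \gamma$,
$\lambda_{10} = -\alpha - \beta + \gamma$, and
$\lambda_{11} = \alpha - \beta - \gamma$. Then, according to Gaussian error
propagation, the squared error is given by [11]

$$ \mathcal{E}(M)^2 = \sum_{k,l} \left( \frac{\partial \langle M \rangle}{\partial n_{kl}} \right)^2 \mathcal{E}(n_{kl})^2 = \sum_{k,l} \frac{(\lambda_{kl} - \langle M \rangle)^2}{n_{\rm tot}^2} \, n_{kl}. \tag{4} $$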
Let us finally discuss the underlying assumptions of this
error model. The first main assumption is that the $n_{kl}$
are Poisson distributed and their errors are uncorrelated.
This is well motivated by the experimental observations.
Moreover, Gaussian error propagation stems from a Taylor
expansion of the function $f$. Finally, if one interprets
the standard deviation as a confidence interval, one
tacitly assumes that the distribution is Gaussian, as for
other distributions the connection is not so direct. If
the number of events for all detectors is sufficiently large
(e.g. $n_{kl} \gtrsim 10$), however, the Poissonian distribution is
approximated well by a Gaussian distribution.
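The error model above can be written out for the two-qubit example. This is a minimal sketch under the stated assumptions (Poissonian counts with error $\sqrt{n_{kl}}$, uncorrelated errors, first-order Gaussian propagation); the function name and the dictionary layout of the counts are hypothetical choices made for illustration.

```python
import math

def correlation_with_error(counts, alpha, beta, gamma):
    """Return (<M>, E(M)) for M = alpha Z1 Z2 + beta Z1 1 + gamma 1 Z2.

    counts maps the outcomes "00", "01", "10", "11" to observed event numbers.
    """
    # Eigenvalue of M on |kl>: Z gives +1 for outcome 0 and -1 for outcome 1.
    lam = {}
    for key in ("00", "01", "10", "11"):
        z1 = 1 if key[0] == "0" else -1
        z2 = 1 if key[1] == "0" else -1
        lam[key] = alpha * z1 * z2 + beta * z1 + gamma * z2

    n_tot = sum(counts[key] for key in lam)
    mean = sum(lam[key] * counts[key] for key in lam) / n_tot

    # d<M>/dn_kl = (lambda_kl - <M>) / n_tot, and E(n_kl)^2 = n_kl.
    err_sq = sum((lam[key] - mean) ** 2 * counts[key] for key in lam) / n_tot ** 2
    return mean, math.sqrt(err_sq)

# Perfectly correlated counts: every populated outcome has lambda = <M>.
mean_corr, err_corr = correlation_with_error(
    {"00": 50, "01": 0, "10": 0, "11": 50}, alpha=1.0, beta=0.0, gamma=0.0)

# Uniform counts: <Z1 Z2> = 0 and the error is 1/sqrt(n_tot).
mean_flat, err_flat = correlation_with_error(
    {"00": 25, "01": 25, "10": 25, "11": 25}, alpha=1.0, beta=0.0, gamma=0.0)
```

The first call shows the situation exploited later in the text: when every populated outcome has the same eigenvalue, the propagated error vanishes.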

Bell inequalities for four particles — Let us now discuss
the Mermin and Ardehali inequalities as experimentally
relevant examples. First, we consider

$$ B_M = X_1 X_2 X_3 X_4 - [X_1 X_2 Y_3 Y_4 + \text{perm.}] + Y_1 Y_2 Y_3 Y_4, \tag{5} $$

where the bracket $[\dots]$ is meant as a sum over all permutations
of $X_1 X_2 Y_3 Y_4$ that yield distinct operators. For
states allowing an LHV description, the Mermin inequality
$\langle B_M \rangle \le 4$ holds. We wrote $B_M$ with the Pauli
matrices as observables, since they are used later; however,
one might replace them by arbitrary dichotomic
measurements.



Second, we consider the Ardehali inequality
$\langle B_A \rangle \le 2\sqrt{2}$, where

$$ \begin{aligned}
B_A = \big( & X_1 X_2 X_3 A_4 + X_1 X_2 X_3 B_4 \\
& - [X_1 Y_2 Y_3 A_4 + \text{perm.}] - [X_1 Y_2 Y_3 B_4 + \text{perm.}] \\
& - [X_1 X_2 Y_3 A_4 + \text{perm.}] + [X_1 X_2 Y_3 B_4 + \text{perm.}] \\
& + Y_1 Y_2 Y_3 A_4 - Y_1 Y_2 Y_3 B_4 \big) / \sqrt{2}.
\end{aligned} \tag{6} $$

Here, the sums in square brackets include all distinct
permutations on the first three qubits. We set
$A_4 = (X_4 + Y_4)/\sqrt{2}$ and $B_4 = (X_4 - Y_4)/\sqrt{2}$, but again, the
observables can remain arbitrary [?].

The Mermin and Ardehali inequalities reveal the nonlocal
correlations of the four-qubit GHZ state,

$$ |GHZ_4\rangle = \frac{1}{\sqrt{2}} \big( |0000\rangle + |1111\rangle \big). \tag{7} $$

For this state we have $\langle B_M \rangle = \langle B_A \rangle = 8$. As the bound
for LHV models for the Ardehali inequality is smaller,
the violation $V$ is larger. This may lead to the opinion
that the Ardehali inequality is "better" than the Mermin
inequality for the state $|GHZ_4\rangle$.

FIG. 1: Significance $S$ for the Mermin (red, solid) and the
Ardehali inequality (blue, dashed) for bit-flip noise. On the
horizontal axes, we show the bit-flip probability and the
corresponding fidelity with respect to a perfect GHZ state. We
assumed that the experimenter prepares 8000 instances of a
GHZ state and chooses either to measure the eight terms of
the Mermin inequality (each term with 1000 realizations of
the state) or the 16 terms of the Ardehali inequality with 500
states per correlation term. See text for further details.
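The values $\langle B_M \rangle = \langle B_A \rangle = 8$ for the four-qubit GHZ state can be checked numerically. The sketch below builds both Bell operators from explicit Pauli matrices with NumPy; the helper names (`word_operator`, `distinct_perms`) and the string encoding of the settings are illustrative assumptions, not part of the original work.

```python
import itertools
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
OPS = {"X": X, "Y": Y,
       "A": (X + Y) / np.sqrt(2), "B": (X - Y) / np.sqrt(2)}

def word_operator(word):
    """Tensor product for a setting such as "XXYA" = X1 X2 Y3 A4."""
    out = np.eye(1, dtype=complex)
    for letter in word:
        out = np.kron(out, OPS[letter])
    return out

def distinct_perms(word):
    """All distinct orderings of the letters in word."""
    return sorted({"".join(w) for w in itertools.permutations(word)})

# Mermin operator: eight settings.
mermin_words = ["XXXX", "YYYY"] + distinct_perms("XXYY")
mermin = (word_operator("XXXX") + word_operator("YYYY")
          - sum(word_operator(w) for w in distinct_perms("XXYY")))

# Ardehali operator: permutations act on the first three qubits only.
ardehali_words = []
ardehali = np.zeros((16, 16), dtype=complex)
for sign_a, sign_b, first_three in [(1, 1, "XXX"), (-1, -1, "XYY"),
                                    (-1, 1, "XXY"), (1, -1, "YYY")]:
    for w in distinct_perms(first_three):
        ardehali += sign_a * word_operator(w + "A") / np.sqrt(2)
        ardehali += sign_b * word_operator(w + "B") / np.sqrt(2)
        ardehali_words += [w + "A", w + "B"]

ghz = np.zeros(16, dtype=complex)
ghz[0] = ghz[15] = 1 / np.sqrt(2)

mermin_value = float(np.real(np.vdot(ghz, mermin @ ghz)))
ardehali_value = float(np.real(np.vdot(ghz, ardehali @ ghz)))
violation_mermin = mermin_value - 4.0
violation_ardehali = ardehali_value - 2.0 * np.sqrt(2)
```

With the LHV bounds 4 and $2\sqrt{2}$ quoted in the text, the two violations differ only through the bound, since both expectation values coincide for the ideal state.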

However, this belief is easily shattered if the significance
$S$ is considered as the relevant figure of merit.
This can be seen directly from Eq. (4): The GHZ state
is an eigenstate of each of the correlation measurements
in the Mermin inequality (they are so-called stabilizing
operators of the GHZ state). Hence, if the Mermin
inequality is measured for a perfect GHZ state, we have in
the last term of Eq. (4) for each case $k, l$ either $\lambda_{kl} = \langle M \rangle$
(since the mean value is an eigenvalue) or $n_{kl} = 0$; hence
$\mathcal{E}(M)$ vanishes. The Ardehali inequality, however, does
not contain stabilizer terms and the error remains finite.
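As a worked special case, consider a single correlation term $T$ with outcomes $\pm 1$, measured on $n$ copies, and apply the Poissonian error model with Gaussian propagation described above. The sketch assumes that the observed counts equal their expected values, $n_\pm = n(1 \pm t)/2$ with $t = \langle T \rangle$, and that errors of different settings are independent and add in quadrature.

```latex
\mathcal{E}(T)^2
  = \frac{n_+ (1 - t)^2 + n_- (-1 - t)^2}{n^2}
  = \frac{(1 - t^2)\,[(1 - t) + (1 + t)]}{2n}
  = \frac{1 - t^2}{n}.

% Stabilizer term of the Mermin inequality on the ideal GHZ state:
%   t = \pm 1          =>  \mathcal{E}(T) = 0.
% Term of the Ardehali inequality on the ideal GHZ state:
%   t = \pm 1/\sqrt{2}  =>  \mathcal{E}(T)^2 = 1/(2n).
% With 16 terms, prefactor 1/\sqrt{2} each, and n = 500 copies per term:
\mathcal{E}(B_A)^2 = \frac{1}{2} \cdot 16 \cdot \frac{1}{2 \cdot 500} = 0.008,
\qquad
S(B_A) = \frac{8 - 2\sqrt{2}}{\sqrt{0.008}} \approx 58.
```

Under these idealized assumptions the Ardehali significance is finite for the perfect GHZ state, while the Mermin error vanishes term by term.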

For an experimental application it is important that
the Mermin inequality leads to a higher significance than
the Ardehali inequality even if noise is introduced [?].
To see this, we considered bit-flip noise, which can easily
be simulated in experiment. To this end, we used a
perfect GHZ state whose qubits are locally affected by
the bit-flip operation $f$ with probability $p$, i.e.
$f(\varrho_i) = (1 - p)\varrho_i + p X_i \varrho_i X_i$ for each qubit $i$. In Fig. 1 we plotted
the significance $S$ versus the fidelity $F$ of the noisy state
w.r.t. a perfect GHZ state, i.e. $F = \langle GHZ_4 | \varrho_{\rm exp} | GHZ_4 \rangle$,
and versus the bit-flip probability $p$. For $F > 0.70$ the
Mermin inequality is more significant (for the 6-qubit
versions of these inequalities [12, 18], this changes to
$F > 0.40$). As can be seen from Eq. (4), the fact that one
witness is more significant than the other one is independent
of the total particle number. Moreover, a calculation
for white noise yields very similar values ($F > 0.72$ for
4 qubits, $F > 0.41$ for 6 qubits). This suggests that the
effect does not depend on details of the noise. Note that
the threshold value for white noise vanishes exponentially
fast for an increasing number of qubits.

FIG. 2: Scheme of the experimental setup. a. The setup to
generate the required four-photon GHZ state. Femtosecond
laser pulses (~ 200 fs, 76 MHz, 788 nm) are converted to
ultraviolet pulses through a frequency-doubling LiB$_3$O$_5$ (LBO)
crystal (not shown). The pulses go through two main $\beta$-barium
borate (BBO) crystals (2 mm), generating two pairs of photons.
The observed two-fold coincidence count rates are about
1.6 × 10^[…] /s with a visibility of 96% (94%) in the H/V (+/−)
basis. b. Setup for engineering the bit-flip noise. c. The
measurement setup.
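The comparison described for Fig. 1 can be sketched numerically. The code below is an illustration under explicit assumptions, not the authors' calculation: each setting is measured on a fixed number of copies (1000 for Mermin, 500 for Ardehali), observed counts are replaced by their expected values so that each $\pm 1$-valued term contributes $(1 - \langle T \rangle^2)/n$ to the squared error, and errors of different settings add in quadrature. All function names are hypothetical.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
OPS = {"X": X, "Y": Y,
       "A": (X + Y) / np.sqrt(2), "B": (X - Y) / np.sqrt(2)}

def tensor(mats):
    out = np.eye(1, dtype=complex)
    for m in mats:
        out = np.kron(out, m)
    return out

def distinct_perms(word):
    return sorted({"".join(w) for w in itertools.permutations(word)})

# Each inequality as a list of (coefficient, word); "XXYA" means X1 X2 Y3 A4.
MERMIN = ([(1.0, "XXXX"), (1.0, "YYYY")]
          + [(-1.0, w) for w in distinct_perms("XXYY")])
ARDEHALI = []
for sign_a, sign_b, first_three in [(1, 1, "XXX"), (-1, -1, "XYY"),
                                    (-1, 1, "XXY"), (1, -1, "YYY")]:
    for w in distinct_perms(first_three):
        ARDEHALI.append((sign_a / np.sqrt(2), w + "A"))
        ARDEHALI.append((sign_b / np.sqrt(2), w + "B"))

GHZ = np.zeros(16, dtype=complex)
GHZ[0] = GHZ[15] = 1 / np.sqrt(2)

def noisy_ghz(p):
    """GHZ state after an independent bit flip with probability p on each qubit."""
    rho = np.outer(GHZ, GHZ.conj())
    for q in range(4):
        flip = tensor([X if k == q else I2 for k in range(4)])
        rho = (1 - p) * rho + p * (flip @ rho @ flip)
    return rho

def fidelity(rho):
    return float(np.real(GHZ.conj() @ rho @ GHZ))

def significance(terms, lhv_bound, rho, copies_per_term):
    """S = V / E with expected counts; use p > 0 to keep the error nonzero."""
    value, err_sq = 0.0, 0.0
    for coeff, word in terms:
        t = float(np.real(np.trace(rho @ tensor([OPS[c] for c in word]))))
        value += coeff * t
        err_sq += coeff**2 * max(1.0 - t**2, 0.0) / copies_per_term
    return (value - lhv_bound) / np.sqrt(err_sq)

for p in (0.02, 0.085, 0.15):
    rho = noisy_ghz(p)
    s_m = significance(MERMIN, 4.0, rho, 1000)
    s_a = significance(ARDEHALI, 2 * np.sqrt(2), rho, 500)
    print(f"p={p:.3f}  F={fidelity(rho):.3f}  "
          f"S_Mermin={s_m:.1f}  S_Ardehali={s_a:.1f}")
```

Within this simplified model the two significances become comparable for a bit-flip probability near 0.085, which corresponds to a fidelity close to the threshold of 0.70 quoted in the text.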

Experimental setup — Spontaneous down conversion
has been used to produce the desired four-photon state
[see Fig. 2(a)]. With the help of polarizing beam splitters
(PBSs), half-wave plates (HWPs) and conventional
photon detectors, we prepare a four-qubit GHZ state,
where $|0\rangle = |H\rangle$ ($|1\rangle = |V\rangle$) denotes horizontal (vertical)
polarization. We have chosen the bit-flip noise
channel to demonstrate the theory introduced in this paper.
As shown in Fig. 2(b), the noisy quantum channels
are engineered by one HWP sandwiched between two
quarter-wave plates (QWPs) [?]. The HWP is switched
randomly between $+\theta$ and $-\theta$ and the QWPs are set
at 0° with respect to the vertical direction. In this
way, the noise channel can be engineered with a bit-flip
probability $p = \sin^2(2\theta)$. The Pauli matrix measurements
required in the Bell test can be implemented by
a combination of HWP, QWP and PBS [see Fig. 2(c)].
The fidelity of the prepared GHZ state is obtained via
$F = \frac{1}{2} \langle |0000\rangle\langle 0000| + |1111\rangle\langle 1111| \rangle + \frac{1}{16} \langle B_M \rangle$.
Without added noise, its value is $F = 0.84 \pm 0.01$.
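The relation $p = \sin^2(2\theta)$ between the wave-plate angle and the bit-flip probability is easy to evaluate. The short sketch below (the function name is an illustrative choice) computes $p$ for the angles used in the experiment; rounded to three decimals, the results agree with the $p$ column of Table I.

```python
import math

def bit_flip_probability(theta_deg):
    """Bit-flip probability p = sin^2(2*theta) for HWP angles +/- theta in degrees."""
    return math.sin(math.radians(2.0 * theta_deg)) ** 2

for theta in (0, 2, 4, 6, 8):
    print(f"theta = +/-{theta} deg  ->  p = {bit_flip_probability(theta):.3f}")
# p = 0.000, 0.005, 0.019, 0.043, 0.076
```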

Experimental results — For different noise levels, the 
experimental results of the violation, the statistical er- 
ror and the significance are shown in Table I. The first 
observation is that, when there is no engineered noise, 
the violation of the Mermin inequality is smaller than 
the violation of the Ardehali inequality. Its significance, 
however, is larger than that of the Ardehali inequality; 
this proves that testing the Mermin inequality is a bet- 
ter choice to characterize the entanglement in this case. 



θ      p      V(B_M)  E(B_M)  S(B_M)  V(B_A)  E(B_A)  S(B_A)
±0°    0      2.37    0.05    44.3    3.65    0.10    35.0
±2°    0.005  2.00    0.06    33.4    3.14    0.11    29.2
±4°    0.019  1.57    0.07    23.7    2.48    0.11    21.8
±6°    0.043  1.13    0.07    16.2    2.05    0.11    17.8
±8°    0.076  0.67    0.08    8.8     1.63

TABLE I: Experimental values of the violation, the statistical
error and the significance for different values of $\theta$ (and the
corresponding $p$). $V(B_M)$, $\mathcal{E}(B_M)$, $S(B_M)$ represent the values of
$V$, $\mathcal{E}$ and $S$ in testing the Mermin inequality; $V(B_A)$, $\mathcal{E}(B_A)$,
$S(B_A)$ represent the corresponding values for the Ardehali
inequality. Each setting $X_1 X_2 X_3 X_4$ etc. in the Mermin
inequality is measured for 800 s, while each setting $X_1 X_2 X_3 A_4$
etc. in the Ardehali inequality is measured for 400 s. The
average total count number for each inequality is about 7500.



Secondly, when the noise level increases, the significance
of the Mermin inequality decreases more quickly. When
$\theta = \pm 6°, \pm 8°$, the significance for the Ardehali inequality
is already larger than that for the Mermin inequality.
Due to the experimental imperfections, the initial
state to which the noise is added is not the perfect GHZ
state. However, assuming an initial state like
$\varrho(p = 0) = \alpha |0000\rangle\langle 0000| + \beta |1111\rangle\langle 1111| + \gamma (|0000\rangle\langle 1111| + |1111\rangle\langle 0000|) + \lambda \mathbb{1}/16$,
where $\alpha = 0.362$, $\beta = 0.522$, $\gamma = 0.398$, $\lambda = 0.12$,
reproduces that for $p < 0.019$ the Mermin
inequality is more significant.
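The crossover reported in Table I can be read off programmatically. The sketch below simply copies the table rows that are complete in this text (the ±8° row is left out because its Ardehali entries are missing here) and compares violation and significance row by row; the tuple layout and function names are illustrative assumptions.

```python
# (theta in degrees, p, V_M, E_M, S_M, V_A, E_A, S_A), copied from Table I.
TABLE_I = [
    (0, 0.000, 2.37, 0.05, 44.3, 3.65, 0.10, 35.0),
    (2, 0.005, 2.00, 0.06, 33.4, 3.14, 0.11, 29.2),
    (4, 0.019, 1.57, 0.07, 23.7, 2.48, 0.11, 21.8),
    (6, 0.043, 1.13, 0.07, 16.2, 2.05, 0.11, 17.8),
]

def larger_violation(row):
    """Name of the inequality with the larger violation V in this row."""
    _, _, v_m, _, _, v_a, _, _ = row
    return "Mermin" if v_m > v_a else "Ardehali"

def more_significant(row):
    """Name of the inequality with the larger significance S in this row."""
    _, _, _, _, s_m, _, _, s_a = row
    return "Mermin" if s_m > s_a else "Ardehali"

for row in TABLE_I:
    print(f"theta = +/-{row[0]} deg: larger V -> {larger_violation(row)}, "
          f"larger S -> {more_significant(row)}")
```

In every listed row the Ardehali inequality has the larger violation, while the Mermin inequality has the larger significance up to ±4°.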

Discussion — We have proved that it can be favor- 
able to use an entanglement witness or a Bell inequality 
that results in a lower violation. We confirmed this ex- 
perimentally using four-photon GHZ states. Our results 
show that the usual way of optimizing witnesses will not 
necessarily lead to more powerful tools for the analysis of 
many-particle experiments. It is important to note that 
when the number of photons in multi-photon experiments 
is increased the count rates decrease; consequently, the 
statistical error becomes more and more relevant. 

Our results provide a direction to find powerful
entanglement tests for low count rates: the observed effect
relied on the fact that in the Mermin inequality only
stabilizer measurements were made. There are already
powerful approaches available to construct witnesses from
stabilizer observables [17], and other Mermin-like or
Ardehali-like Bell inequalities have also been explored [18].
Consequently, these approaches are promising candidates
for developing sensitive analysis tools. Further, inequalities
similar to witnesses have been proposed and used to
characterize quantum gate fidelities [19], which is another
application of our theory. Finally, we believe that results
on statistical confidence from other fields of physics (e.g.
Ref. [?]) can give new insights in advanced experiments on
quantum information processing.
CAS, the IGF at HFNL, and the A. v. Humboldt Foun-
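Appendix — To prove the Observation, we use as an
ansatz for the improved witness $W' = W + \gamma P$, where
$\gamma > 0$ and $P$ is a positive observable with unit trace. For
small $\gamma$, we expand

$$ -\frac{\langle W' \rangle}{\Delta(W')} \approx -\frac{\langle W \rangle}{\Delta(W)} + \gamma \left[ \frac{\langle W \rangle}{2 \Delta^3(W)} \big( \langle WP + PW \rangle - 2 \langle W \rangle \langle P \rangle \big) - \frac{\langle P \rangle}{\Delta(W)} \right] + O(\gamma^2). $$

Maximizing this expression
over all positive $P$ with ${\rm Tr}(P) = 1$ is equivalent to
minimizing ${\rm Tr}(QP)$, where
$Q = \varrho W + W \varrho - 2 (\langle W^2 \rangle / \langle W \rangle) \varrho$.
Hence the optimal $P$ is a one-dimensional projector
$P = |\phi\rangle\langle\phi|$, where $|\phi\rangle$ is an eigenvector corresponding to the
minimal eigenvalue of $Q$. We still have to show that this
minimal eigenvalue is negative. To this end, we make
the ansatz $|\phi\rangle = \alpha |\psi\rangle + \beta |\psi^\perp\rangle$, where
$\langle \psi | \psi^\perp \rangle = 0$. We then have to minimize

$$ {\rm Tr}(QP) = 2 {\rm Re}\big( \alpha^* \beta \langle \psi | W | \psi^\perp \rangle \big) - 2 |\alpha|^2 \frac{\Delta_\psi^2(W)}{\langle W \rangle}. $$

We can always choose the phases of $\alpha$ and
$\beta$ such that ${\rm Re}(\dots)$ is negative. Therefore the optimal
$|\psi^\perp\rangle$ is the vector orthogonal to $|\psi\rangle$ which maximizes
$|\langle \psi | W | \psi^\perp \rangle|$, i.e.,
$|\psi^\perp_{\rm opt}\rangle = [\mathbb{1} - |\psi\rangle\langle\psi|] W |\psi\rangle / \Delta_\psi(W)$.
Furthermore, we can always choose the moduli of $\alpha$ and $\beta$
such that the negative term $2 {\rm Re}(\dots)$ dominates the
positive second term. This shows that the minimal eigenvalue
of $Q$ is negative.

For finite $\gamma$ we can iterate this procedure. We always
find the same $|\psi^\perp_{\rm opt}\rangle$ (though $\alpha$ and $\beta$ will be different
in each iteration step). Thus, we make the ansatz
$\gamma P = a |\psi\rangle\langle\psi| + b |\psi^\perp_{\rm opt}\rangle\langle\psi^\perp_{\rm opt}| + c |\psi\rangle\langle\psi^\perp_{\rm opt}| + {\rm h.c.}$
for the final result of the iteration. If we choose
$c = -\Delta_\psi(W)$, $ab > |c|^2$, and $a, b > 0$, then $\gamma P$ is positive,
$|\psi\rangle$ is an eigenstate of the resulting witness $W'' = W + \gamma P$,
and $\Delta_\psi(W'')$ is zero, so $S$ diverges. □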